Overview

Dataset statistics

Number of variables: 15
Number of observations: 3194
Missing cells: 4
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 469.5 B

Variable types

Numeric: 8
DateTime: 1
Categorical: 6

Alerts

last_date has constant value "2022-03-07 00:00:00" Constant
state_name has a high cardinality: 52 distinct values High cardinality
county_name has a high cardinality: 1899 distinct values High cardinality
cases is highly correlated with deaths and 1 other fields High correlation
deaths is highly correlated with cases and 1 other fields High correlation
population is highly correlated with cases and 1 other fields High correlation
county_type is highly correlated with cbsa_rural and 2 other fields High correlation
cbsa_rural is highly correlated with county_type and 2 other fields High correlation
cbsa_county_type is highly correlated with county_type and 2 other fields High correlation
cbsa_type is highly correlated with county_type and 2 other fields High correlation
fips is highly correlated with state_name and 2 other fields High correlation
state_name is highly correlated with fips and 5 other fields High correlation
latitude is highly correlated with fips and 3 other fields High correlation
longitude is highly correlated with fips and 2 other fields High correlation
cbsa_rural is highly correlated with cbsa_type and 2 other fields High correlation
cbsa_type is highly correlated with state_name and 3 other fields High correlation
county_type is highly correlated with cbsa_rural and 2 other fields High correlation
cbsa_county_type is highly correlated with cbsa_rural and 2 other fields High correlation
cases_per_k is highly correlated with state_name High correlation
deaths_per_k is highly correlated with state_name and 1 other fields High correlation
fips has unique values Unique
latitude has unique values Unique
longitude has unique values Unique
deaths has 96 (3.0%) zeros Zeros
deaths_per_k has 96 (3.0%) zeros Zeros

Reproduction

Analysis started: 2022-03-09 14:35:02.987722
Analysis finished: 2022-03-09 14:35:12.274711
Duration: 9.29 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

fips
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 3194
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31290.39167
Minimum: 1001
Maximum: 72153
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 5100.3
Q1: 19023.5
median: 30006
Q3: 46076.5
95-th percentile: 55025.7
Maximum: 72153
Range: 71152
Interquartile range (IQR): 27053

Descriptive statistics

Standard deviation: 16280.83748
Coefficient of variation (CV): 0.5203142759
Kurtosis: -0.6124471655
Mean: 31290.39167
Median Absolute Deviation (MAD): 12098
Skewness: 0.1714027627
Sum: 99941511
Variance: 265065669.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
51199 | 1 | < 0.1%
54079 | 1 | < 0.1%
32017 | 1 | < 0.1%
13115 | 1 | < 0.1%
54071 | 1 | < 0.1%
47059 | 1 | < 0.1%
9013 | 1 | < 0.1%
54067 | 1 | < 0.1%
9009 | 1 | < 0.1%
54063 | 1 | < 0.1%
Other values (3184) | 3184 | 99.7%

Value | Count | Frequency (%)
1001 | 1 | < 0.1%
1003 | 1 | < 0.1%
1005 | 1 | < 0.1%
1007 | 1 | < 0.1%
1009 | 1 | < 0.1%
1011 | 1 | < 0.1%
1013 | 1 | < 0.1%
1015 | 1 | < 0.1%
1017 | 1 | < 0.1%
1019 | 1 | < 0.1%

Value | Count | Frequency (%)
72153 | 1 | < 0.1%
72151 | 1 | < 0.1%
72149 | 1 | < 0.1%
72147 | 1 | < 0.1%
72145 | 1 | < 0.1%
72143 | 1 | < 0.1%
72141 | 1 | < 0.1%
72139 | 1 | < 0.1%
72137 | 1 | < 0.1%
72135 | 1 | < 0.1%

last_date
Date

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.1 KiB
Minimum: 2022-03-07 00:00:00
Maximum: 2022-03-07 00:00:00
Histogram with fixed size bins (bins=1)

state_name
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 52
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 203.4 KiB
Texas | 254
Georgia | 159
Virginia | 133
Kentucky | 120
Missouri | 115
Other values (47) | 2413

Length

Max length: 20
Median length: 8
Mean length: 8.178459612
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: South Carolina
2nd row: Louisiana
3rd row: Virginia
4th row: Idaho
5th row: Iowa

Common Values

Value | Count | Frequency (%)
Texas | 254 | 8.0%
Georgia | 159 | 5.0%
Virginia | 133 | 4.2%
Kentucky | 120 | 3.8%
Missouri | 115 | 3.6%
Kansas | 105 | 3.3%
Illinois | 102 | 3.2%
North Carolina | 100 | 3.1%
Iowa | 99 | 3.1%
Tennessee | 95 | 3.0%
Other values (42) | 1912 | 59.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
texas | 254 | 6.8%
virginia | 188 | 5.0%
georgia | 159 | 4.3%
north | 153 | 4.1%
carolina | 146 | 3.9%
new | 126 | 3.4%
kentucky | 120 | 3.2%
dakota | 119 | 3.2%
missouri | 115 | 3.1%
south | 112 | 3.0%
Other values (47) | 2233 | 59.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

county_name
Categorical

HIGH CARDINALITY

Distinct: 1899
Distinct (%): 59.5%
Missing: 0
Missing (%): 0.0%
Memory size: 199.9 KiB
Washington | 30
Jefferson | 26
Franklin | 25
Jackson | 24
Lincoln | 24
Other values (1894) | 3065

Length

Max length: 35
Median length: 7
Mean length: 7.053537884
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1460
Unique (%): 45.7%

Sample

1st row: Abbeville
2nd row: Acadia
3rd row: Accomack
4th row: Ada
5th row: Adair

Common Values

Value | Count | Frequency (%)
Washington | 30 | 0.9%
Jefferson | 26 | 0.8%
Franklin | 25 | 0.8%
Jackson | 24 | 0.8%
Lincoln | 24 | 0.8%
Madison | 20 | 0.6%
Union | 18 | 0.6%
Montgomery | 18 | 0.6%
Clay | 18 | 0.6%
Monroe | 17 | 0.5%
Other values (1889) | 2974 | 93.1%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
washington | 30 | 0.9%
jefferson | 28 | 0.8%
franklin | 26 | 0.8%
st | 26 | 0.8%
jackson | 24 | 0.7%
lincoln | 24 | 0.7%
san | 21 | 0.6%
madison | 20 | 0.6%
clay | 18 | 0.5%
montgomery | 18 | 0.5%
Other values (1920) | 3200 | 93.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

latitude
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 3194
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.95260942
Minimum: 17.982429
Maximum: 69.31479216
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 17.982429
5-th percentile: 29.50182399
Q1: 34.318339
median: 38.17733938
Q3: 41.71439139
95-th percentile: 46.56236706
Maximum: 69.31479216
Range: 51.33236316
Interquartile range (IQR): 7.396052395

Descriptive statistics

Standard deviation: 6.100628874
Coefficient of variation (CV): 0.1607433314
Kurtosis: 2.554447873
Mean: 37.95260942
Median Absolute Deviation (MAD): 3.700770815
Skewness: -0.2866570824
Sum: 121220.6345
Variance: 37.21767266
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
44.06482372 | 1 | < 0.1%
48.82227976 | 1 | < 0.1%
33.74314981 | 1 | < 0.1%
41.0277278 | 1 | < 0.1%
18.425688 | 1 | < 0.1%
40.16143691 | 1 | < 0.1%
34.39287073 | 1 | < 0.1%
36.14005459 | 1 | < 0.1%
40.17637905 | 1 | < 0.1%
42.38696136 | 1 | < 0.1%
Other values (3184) | 3184 | 99.7%

Value | Count | Frequency (%)
17.982429 | 1 | < 0.1%
17.994525 | 1 | < 0.1%
17.998457 | 1 | < 0.1%
18.007516 | 1 | < 0.1%
18.010387 | 1 | < 0.1%
18.011661 | 1 | < 0.1%
18.017889 | 1 | < 0.1%
18.03174 | 1 | < 0.1%
18.039942 | 1 | < 0.1%
18.040993 | 1 | < 0.1%

Value | Count | Frequency (%)
69.31479216 | 1 | < 0.1%
67.04919196 | 1 | < 0.1%
65.50815459 | 1 | < 0.1%
64.90320724 | 1 | < 0.1%
64.80726247 | 1 | < 0.1%
63.87692095 | 1 | < 0.1%
63.67264044 | 1 | < 0.1%
62.31305045 | 1 | < 0.1%
62.1542916 | 1 | < 0.1%
61.16666 | 1 | < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 3194
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -91.50699294
Minimum: -174.1596
Maximum: -65.28813
Zeros: 0
Zeros (%): 0.0%
Negative: 3194
Negative (%): 100.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: -174.1596
5-th percentile: -117.7959844
Q1: -97.85755585
median: -89.90057493
Q3: -82.9498558
95-th percentile: -74.00867694
Maximum: -65.28813
Range: 108.87147
Interquartile range (IQR): 14.90770005

Descriptive statistics

Standard deviation: 13.2725518
Coefficient of variation (CV): -0.1450441258
Kurtosis: 3.697462644
Mean: -91.50699294
Median Absolute Deviation (MAD): 7.453285195
Skewness: -1.246728348
Sum: -292273.3354
Variance: 176.1606312
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-66.919643 | 1 | < 0.1%
-85.82630386 | 1 | < 0.1%
-109.8464636 | 1 | < 0.1%
-92.02640019 | 1 | < 0.1%
-84.19965764 | 1 | < 0.1%
-99.23692334 | 1 | < 0.1%
-149.5741743 | 1 | < 0.1%
-83.92097166 | 1 | < 0.1%
-124.157282 | 1 | < 0.1%
-102.5555498 | 1 | < 0.1%
Other values (3184) | 3184 | 99.7%

Value | Count | Frequency (%)
-174.1596 | 1 | < 0.1%
-164.0353804 | 1 | < 0.1%
-163.3967883 | 1 | < 0.1%
-162.8905196 | 1 | < 0.1%
-161.9722021 | 1 | < 0.1%
-159.8561831 | 1 | < 0.1%
-159.7503946 | 1 | < 0.1%
-159.5966786 | 1 | < 0.1%
-158.2381942 | 1 | < 0.1%
-157.9712182 | 1 | < 0.1%

Value | Count | Frequency (%)
-65.28813 | 1 | < 0.1%
-65.440971 | 1 | < 0.1%
-65.666416 | 1 | < 0.1%
-65.666866 | 1 | < 0.1%
-65.725097 | 1 | < 0.1%
-65.753897 | 1 | < 0.1%
-65.810174 | 1 | < 0.1%
-65.813742 | 1 | < 0.1%
-65.869468 | 1 | < 0.1%
-65.887612 | 1 | < 0.1%

cbsa_rural
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 191.7 KiB
CBSA | 1870
Rural | 1324

Length

Max length: 5
Median length: 4
Mean length: 4.414527239
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CBSA
2nd row: CBSA
3rd row: Rural
4th row: CBSA
5th row: Rural

Common Values

Value | Count | Frequency (%)
CBSA | 1870 | 58.5%
Rural | 1324 | 41.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
cbsa | 1870 | 58.5%
rural | 1324 | 41.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

cbsa_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.3 KiB
Other | 1324
Metropolitan Statistical Area | 1228
Micropolitan Statistical Area | 642

Length

Max length: 29
Median length: 29
Mean length: 19.05134627
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Micropolitan Statistical Area
2nd row: Metropolitan Statistical Area
3rd row: Other
4th row: Metropolitan Statistical Area
5th row: Other

Common Values

Value | Count | Frequency (%)
Other | 1324 | 41.5%
Metropolitan Statistical Area | 1228 | 38.4%
Micropolitan Statistical Area | 642 | 20.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
area | 1870 | 27.0%
statistical | 1870 | 27.0%
other | 1324 | 19.1%
metropolitan | 1228 | 17.7%
micropolitan | 642 | 9.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

county_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 197.7 KiB
Central | 1330
Other | 1324
Outlying | 540

Length

Max length: 8
Median length: 7
Mean length: 6.340012523
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Outlying
2nd row: Outlying
3rd row: Other
4th row: Central
5th row: Other

Common Values

Value | Count | Frequency (%)
Central | 1330 | 41.6%
Other | 1324 | 41.5%
Outlying | 540 | 16.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
central | 1330 | 41.6%
other | 1324 | 41.5%
outlying | 540 | 16.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

cbsa_county_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 210.5 KiB
Other | 1324
Metro: Central | 780
Micro: Central | 550
Metro: Outlying | 448
Micro: Outlying | 92

Length

Max length: 15
Median length: 14
Mean length: 10.43832185
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Micro: Outlying
2nd row: Metro: Outlying
3rd row: Other
4th row: Metro: Central
5th row: Other

Common Values

Value | Count | Frequency (%)
Other | 1324 | 41.5%
Metro: Central | 780 | 24.4%
Micro: Central | 550 | 17.2%
Metro: Outlying | 448 | 14.0%
Micro: Outlying | 92 | 2.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
central | 1330 | 26.3%
other | 1324 | 26.1%
metro | 1228 | 24.2%
micro | 642 | 12.7%
outlying | 540 | 10.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

cases
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2926
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24530.32498
Minimum: 32
Maximum: 2805119
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 32
5-th percentile: 616.9
Q1: 2581.25
median: 6244.5
Q3: 16450.75
95-th percentile: 101889.2
Maximum: 2805119
Range: 2805087
Interquartile range (IQR): 13869.5

Descriptive statistics

Standard deviation: 83798.38078
Coefficient of variation (CV): 3.416113763
Kurtosis: 428.4296948
Mean: 24530.32498
Median Absolute Deviation (MAD): 4543
Skewness: 16.21893143
Sum: 78349858
Variance: 7022168621
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2268 | 4 | 0.1%
1958 | 3 | 0.1%
2027 | 3 | 0.1%
2118 | 3 | 0.1%
2843 | 3 | 0.1%
4439 | 3 | 0.1%
2014 | 3 | 0.1%
1147 | 3 | 0.1%
1053 | 3 | 0.1%
2136 | 3 | 0.1%
Other values (2916) | 3163 | 99.0%

Value | Count | Frequency (%)
32 | 1 | < 0.1%
35 | 1 | < 0.1%
41 | 1 | < 0.1%
63 | 1 | < 0.1%
65 | 1 | < 0.1%
67 | 1 | < 0.1%
74 | 1 | < 0.1%
80 | 1 | < 0.1%
82 | 1 | < 0.1%
87 | 1 | < 0.1%

Value | Count | Frequency (%)
2805119 | 1 | < 0.1%
1248818 | 1 | < 0.1%
1178827 | 1 | < 0.1%
1114552 | 1 | < 0.1%
994677 | 1 | < 0.1%
791352 | 1 | < 0.1%
687264 | 1 | < 0.1%
635446 | 1 | < 0.1%
615296 | 1 | < 0.1%
599698 | 1 | < 0.1%

deaths
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 757
Distinct (%): 23.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 294.5544771
Minimum: 0
Maximum: 31046
Zeros: 96
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 36
median: 89
Q3: 219.75
95-th percentile: 1123.05
Maximum: 31046
Range: 31046
Interquartile range (IQR): 183.75

Descriptive statistics

Standard deviation: 977.429549
Coefficient of variation (CV): 3.318332006
Kurtosis: 365.7224728
Mean: 294.5544771
Median Absolute Deviation (MAD): 66
Skewness: 15.18832301
Sum: 940807
Variance: 955368.5233
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 96 | 3.0%
11 | 32 | 1.0%
9 | 28 | 0.9%
8 | 26 | 0.8%
55 | 25 | 0.8%
19 | 25 | 0.8%
46 | 24 | 0.8%
24 | 24 | 0.8%
28 | 24 | 0.8%
16 | 23 | 0.7%
Other values (747) | 2867 | 89.8%

Value | Count | Frequency (%)
0 | 96 | 3.0%
1 | 21 | 0.7%
2 | 15 | 0.5%
3 | 19 | 0.6%
4 | 16 | 0.5%
5 | 18 | 0.6%
6 | 15 | 0.5%
7 | 23 | 0.7%
8 | 26 | 0.8%
9 | 28 | 0.9%

Value | Count | Frequency (%)
31046 | 1 | < 0.1%
15625 | 1 | < 0.1%
14123 | 1 | < 0.1%
12732 | 1 | < 0.1%
11775 | 1 | < 0.1%
10724 | 1 | < 0.1%
10320 | 1 | < 0.1%
7713 | 1 | < 0.1%
7698 | 1 | < 0.1%
7570 | 1 | < 0.1%

population
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3138
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103654.6359
Minimum: 169
Maximum: 10039107
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 169
5-th percentile: 2831.55
Q1: 11195.25
median: 26274
Q3: 67625.75
95-th percentile: 441910.5
Maximum: 10039107
Range: 10038938
Interquartile range (IQR): 56430.5

Descriptive statistics

Standard deviation: 330850.5584
Coefficient of variation (CV): 3.191854909
Kurtosis: 305.8528033
Mean: 103654.6359
Median Absolute Deviation (MAD): 18870.5
Skewness: 13.53821422
Sum: 331072907
Variance: 1.09462092 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14651 | 3 | 0.1%
2530 | 3 | 0.1%
8179 | 2 | 0.1%
2356 | 2 | 0.1%
2879 | 2 | 0.1%
3838 | 2 | 0.1%
13943 | 2 | 0.1%
12196 | 2 | 0.1%
6229 | 2 | 0.1%
1794 | 2 | 0.1%
Other values (3128) | 3172 | 99.3%

Value | Count | Frequency (%)
169 | 1 | < 0.1%
272 | 1 | < 0.1%
404 | 1 | < 0.1%
463 | 1 | < 0.1%
465 | 1 | < 0.1%
487 | 1 | < 0.1%
494 | 1 | < 0.1%
579 | 1 | < 0.1%
623 | 1 | < 0.1%
625 | 1 | < 0.1%

Value | Count | Frequency (%)
10039107 | 1 | < 0.1%
5150233 | 1 | < 0.1%
4713325 | 1 | < 0.1%
4485414 | 1 | < 0.1%
3338330 | 1 | < 0.1%
3175692 | 1 | < 0.1%
2716940 | 1 | < 0.1%
2635516 | 1 | < 0.1%
2559903 | 1 | < 0.1%
2470546 | 1 | < 0.1%

cases_per_k
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3191
Distinct (%): > 99.9%
Missing: 2
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 240.2849101
Minimum: 51.34281201
Maximum: 1617.647059
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 51.34281201
5-th percentile: 148.4619795
Q1: 205.5453362
median: 241.0387501
Q3: 272.9211987
95-th percentile: 322.7689021
Maximum: 1617.647059
Range: 1566.304247
Interquartile range (IQR): 67.37586254

Descriptive statistics

Standard deviation: 61.05925171
Coefficient of variation (CV): 0.2541118861
Kurtosis: 83.36423366
Mean: 240.2849101
Median Absolute Deviation (MAD): 33.86536733
Skewness: 4.020983894
Sum: 766989.433
Variance: 3728.232219
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
167.0428894 | 2 | 0.1%
238.1818806 | 1 | < 0.1%
211.9086909 | 1 | < 0.1%
198.3331773 | 1 | < 0.1%
347.4025974 | 1 | < 0.1%
183.1693989 | 1 | < 0.1%
119.4824962 | 1 | < 0.1%
288.8946224 | 1 | < 0.1%
250.424065 | 1 | < 0.1%
215.4637969 | 1 | < 0.1%
Other values (3181) | 3181 | 99.6%
(Missing) | 2 | 0.1%

Value | Count | Frequency (%)
51.34281201 | 1 | < 0.1%
69.93261701 | 1 | < 0.1%
71.2575533 | 1 | < 0.1%
76.55502392 | 1 | < 0.1%
78.29351129 | 1 | < 0.1%
80.61915511 | 1 | < 0.1%
81.01851852 | 1 | < 0.1%
81.8353237 | 1 | < 0.1%
87.28749048 | 1 | < 0.1%
89.35542639 | 1 | < 0.1%

Value | Count | Frequency (%)
1617.647059 | 1 | < 0.1%
705.3701016 | 1 | < 0.1%
631.8380236 | 1 | < 0.1%
584.7272727 | 1 | < 0.1%
558.4369449 | 1 | < 0.1%
555.8690176 | 1 | < 0.1%
553.7884665 | 1 | < 0.1%
541.9622772 | 1 | < 0.1%
536.8095283 | 1 | < 0.1%
493.2372506 | 1 | < 0.1%

deaths_per_k
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 3081
Distinct (%): 96.5%
Missing: 2
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.454693529
Minimum: 0
Maximum: 13.59516616
Zeros: 96
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.827986562
Q1: 2.401081901
median: 3.413755612
Q3: 4.472142439
95-th percentile: 6.162881628
Maximum: 13.59516616
Range: 13.59516616
Interquartile range (IQR): 2.071060538

Descriptive statistics

Standard deviation: 1.629569511
Coefficient of variation (CV): 0.4716972714
Kurtosis: 1.055820508
Mean: 3.454693529
Median Absolute Deviation (MAD): 1.044616124
Skewness: 0.3849434644
Sum: 11027.38174
Variance: 2.655496791
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 96 | 3.0%
3.624282694 | 3 | 0.1%
5.041593143 | 2 | 0.1%
1.566306996 | 2 | 0.1%
3.007518797 | 2 | 0.1%
2.721088435 | 2 | 0.1%
1.719056974 | 2 | 0.1%
5.340453939 | 2 | 0.1%
4.078511343 | 2 | 0.1%
3.904343582 | 2 | 0.1%
Other values (3071) | 3077 | 96.3%

Value | Count | Frequency (%)
0 | 96 | 3.0%
0.1214107934 | 1 | < 0.1%
0.1391207568 | 1 | < 0.1%
0.1399776036 | 1 | < 0.1%
0.2636783125 | 1 | < 0.1%
0.2727024816 | 1 | < 0.1%
0.3350270814 | 1 | < 0.1%
0.347826087 | 1 | < 0.1%
0.352758572 | 1 | < 0.1%
0.3736920777 | 1 | < 0.1%

Value | Count | Frequency (%)
13.59516616 | 1 | < 0.1%
11.4492317 | 1 | < 0.1%
10.38062284 | 1 | < 0.1%
10.19332162 | 1 | < 0.1%
9.857072449 | 1 | < 0.1%
9.849470359 | 1 | < 0.1%
9.803921569 | 1 | < 0.1%
9.79934671 | 1 | < 0.1%
9.199215805 | 1 | < 0.1%
9.109223088 | 1 | < 0.1%

Interactions

[Interaction plots: 64 figures, not reproduced in this text]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
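
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling itself runs: the helper names `average_ranks`, `pearson` and `spearman_rho` are made up for this example, and the inputs are the cases and deaths columns of the six sample rows shown at the end of this report.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# cases and deaths from the six sample rows (illustrative subset only)
cases = [6615, 14988, 6915, 131024, 1577, 5549]
deaths = [67, 287, 103, 992, 50, 99]
rho = spearman_rho(cases, deaths)
print(round(rho, 4))  # 0.9429
```

Only one pair of rows swaps rank order between the two columns, so ρ stays close to 1 for this small subset; the value for the full dataset will differ.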

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
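
As a rough sketch of that formula and of the invariance property, the following plain-Python example computes r for the cases and population columns of the six sample rows at the end of this report, then recomputes it after shifting and (positively) rescaling each variable. The function name `pearson_r` and the particular shift and scale constants are assumptions chosen for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)


# cases and population from the six sample rows (illustrative subset only)
cases = [6615, 14988, 6915, 131024, 1577, 5549]
population = [24527, 62045, 32316, 481587, 7152, 19202]

r_original = pearson_r(cases, population)

# Shift and rescale each variable separately (positive scale factors);
# r should be unchanged up to floating-point error.
r_rescaled = pearson_r(
    [c / 1000 + 5 for c in cases],
    [2 * p - 100 for p in population],
)
print(abs(r_original - r_rescaled) < 1e-9)  # True
```

A negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.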

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
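
The pair-counting definition above can be written out directly. The sketch below implements exactly that formula (often called τ-a), which does not adjust for ties; tie-corrected variants exist and may be what a given library reports, so treat this as an illustration of the definition only. The function name `kendall_tau_a` is made up, and the inputs are the cases and deaths columns of the six sample rows at the end of this report.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie; it counts toward neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# cases and deaths from the six sample rows (illustrative subset only)
cases = [6615, 14988, 6915, 131024, 1577, 5549]
deaths = [67, 287, 103, 992, 50, 99]
tau = kendall_tau_a(cases, deaths)
print(round(tau, 4))  # 0.8667 (14 concordant pairs, 1 discordant, 15 in total)
```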

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples, so this report uses the bias-corrected measure proposed by Bergsma in 2013.
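
A minimal sketch of a bias-corrected Cramér's V follows, assuming the correction takes the commonly cited form: subtract (k-1)(r-1)/(n-1) from φ² = χ²/n, and shrink the row and column counts r and k by (r-1)²/(n-1) and (k-1)²/(n-1). The function name `cramers_v_corrected` and the toy category labels are invented for illustration, and this is not necessarily the exact code path pandas-profiling uses.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))   # observed cell counts
    row = Counter(a)             # marginal counts of a
    col = Counter(b)             # marginal counts of b

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for ra, r_count in row.items():
        for cb, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((ra, cb), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row), len(col)
    # Corrected quantities (assumed form of the correction, see lead-in).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data (made up): one variable fully determines the other ...
perfect = cramers_v_corrected(["CBSA", "Rural"] * 50, ["Metro", "Other"] * 50)
# ... and two variables that are independent by construction.
independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
```

With these inputs `perfect` comes out at 1 (up to rounding) and `independent` at 0, matching the two ends of the range described above.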

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
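The nullity correlation shown in the heatmap can be sketched as an ordinary correlation between presence indicators. The following is a minimal illustration, assuming each column is a list in which `None` marks a missing value; the function name `nullity_correlation` and the data are hypothetical and do not come from this dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Correlation between the presence (1) / absence (0) patterns of two columns.

    Returns None when either column is fully present or fully missing,
    because its presence indicator then has no variance.
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Made-up columns: the first two are always missing together,
# the third is present exactly when the first is missing.
col_1 = [1.0, None, 3.0, None]
col_2 = [2.0, None, 5.0, None]
col_3 = [None, 7.0, None, 9.0]
complete = [1, 2, 3, 4]

together = nullity_correlation(col_1, col_2)
opposite = nullity_correlation(col_1, col_3)
undefined = nullity_correlation(col_1, complete)
```

A value of 1 means the two columns are missing in exactly the same rows, -1 means one is present exactly where the other is missing, and a column with no missing values yields no defined correlation.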

Sample

First rows

(index) | fips | last_date | state_name | county_name | latitude | longitude | cbsa_rural | cbsa_type | county_type | cbsa_county_type | cases | deaths | population | cases_per_k | deaths_per_k
0 | 45001 | 2022-03-07 | South Carolina | Abbeville | 34.223334 | -82.461707 | CBSA | Micropolitan Statistical Area | Outlying | Micro: Outlying | 6615 | 67 | 24527 | 268.280813 | 2.717281
1 | 22001 | 2022-03-07 | Louisiana | Acadia | 30.295065 | -92.414197 | CBSA | Metropolitan Statistical Area | Outlying | Metro: Outlying | 14988 | 287 | 62045 | 239.547372 | 4.587009
2 | 51001 | 2022-03-07 | Virginia | Accomack | 37.767072 | -75.632346 | Rural | Other | Other | Other | 6915 | 103 | 32316 | 211.196628 | 3.145807
3 | 16001 | 2022-03-07 | Idaho | Ada | 43.452658 | -116.241552 | CBSA | Metropolitan Statistical Area | Central | Metro: Central | 131024 | 992 | 481587 | 293.741537 | 2.223956
4 | 19001 | 2022-03-07 | Iowa | Adair | 41.330756 | -94.471059 | Rural | Other | Other | Other | 1577 | 50 | 7152 | 221.364402 | 7.018529
5 | 21001 | 2022-03-07 | Kentucky | Adair | 37.104598 | -85.281297 | Rural | Other | Other | Other | 5549 | 99 | 19202 | 288.394574 | 5.145263
6 | 29001 | 2022-03-07 | Missouri | Adair | 40.190586 | -92.600782 | CBSA | Micropolitan Statistical Area | Central | Micro: Central | 5475 | 61 | 25343 | 216.189536 | 2.408687
7 | 40001 | 2022-03-07 | Oklahoma | Adair | 35.884942 | -94.658593 | Rural | Other | Other | Other | 7808 | 69 | 22194 | 353.095464 | 3.120336
8 | 8001 | 2022-03-07 | Colorado | Adams | 39.874321 | -104.336258 | CBSA | Metropolitan Statistical Area | Central | Metro: Central | 129091 | 1234 | 517421 | 259.680356 | 2.482323
9 | 16003 | 2022-03-07 | Idaho | Adams | 44.893336 | -116.454525 | Rural | Other | Other | Other | 692 | 15 | 4294 | 172.182135 | 3.732272

Last rows

(index) | fips | last_date | state_name | county_name | latitude | longitude | cbsa_rural | cbsa_type | county_type | cbsa_county_type | cases | deaths | population | cases_per_k | deaths_per_k
3184 | 45091 | 2022-03-07 | South Carolina | York | 34.972815 | -81.180859 | CBSA | Metropolitan Statistical Area | Outlying | Metro: Outlying | 77271 | 617 | 280979 | 298.757738 | 2.385546
3185 | 51199 | 2022-03-07 | Virginia | York | 37.243748 | -76.544128 | CBSA | Metropolitan Statistical Area | Central | Metro: Central | 9665 | 105 | 68280 | 143.000873 | 1.553553
3186 | 48503 | 2022-03-07 | Texas | Young | 33.176597 | -98.687909 | Rural | Other | Other | Other | 3885 | 89 | 18010 | 214.474992 | 4.913327
3187 | 6115 | 2022-03-07 | California | Yuba | 39.262559 | -121.353564 | CBSA | Metropolitan Statistical Area | Central | Metro: Central | 16869 | 115 | 78668 | 223.451181 | 1.523320
3188 | 2290 | 2022-03-07 | Alaska | Yukon-Koyukuk | 65.508155 | -151.390739 | Rural | Other | Other | Other | 1332 | 9 | 5230 | 245.983380 | 1.662050
3189 | 4027 | 2022-03-07 | Arizona | Yuma | 32.768957 | -113.906667 | CBSA | Metropolitan Statistical Area | Central | Metro: Central | 62171 | 1116 | 213787 | 299.144970 | 5.369799
3190 | 8125 | 2022-03-07 | Colorado | Yuma | 40.003468 | -102.425867 | Rural | Other | Other | Other | 1849 | 23 | 10019 | 183.632933 | 2.284239
3191 | 48505 | 2022-03-07 | Texas | Zapata | 27.001564 | -99.169872 | CBSA | Micropolitan Statistical Area | Central | Micro: Central | 3652 | 52 | 14179 | 254.158257 | 3.618902
3192 | 48507 | 2022-03-07 | Texas | Zavala | 28.866172 | -99.760508 | Rural | Other | Other | Other | 4025 | 67 | 11840 | 331.794576 | 5.523040
3193 | 46137 | 2022-03-07 | South Dakota | Ziebach | 44.978819 | -101.665462 | Rural | Other | Other | Other | 666 | 11 | 2756 | 236.673774 | 3.909026